Accurate Solubility Prediction with Error Bars for Electrolytes: A Machine Learning Approach

Authors

  • Anton Schwaighofer
  • Timon Schroeter
  • Sebastian Mika
  • Julian Laub
  • Antonius ter Laak
  • Detlev Sülzle
  • Ursula Ganzer
  • Nikolaus Heinrich
  • Klaus-Robert Müller
Abstract

Accurate in silico models for predicting aqueous solubility are needed in drug design and discovery and in many other areas of chemical research. We present a statistical model of aqueous solubility based on measured data, using a Gaussian Process nonlinear regression model (GPsol). We compare our results with those of 14 scientific studies and 6 commercial tools. The comparison shows that the developed model achieves much higher accuracy than available commercial tools for predicting the solubility of electrolytes. In addition to its high accuracy, the proposed machine learning model provides error bars for each individual prediction.
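
To illustrate how a Gaussian Process regression can attach an error bar to each individual prediction, here is a minimal sketch in plain Python. It is not the authors' GPsol implementation: the squared-exponential kernel, the fixed hyperparameters (`length_scale`, `signal_var`, `noise_var`), the zero prior mean, and the one-dimensional toy data are all assumptions made for illustration, and the function names are hypothetical. The reported standard deviation is that of the latent function, without the observation noise.

```python
import math

def rbf(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two scalar inputs (assumed kernel)."""
    return signal_var * math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L x = b."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def solve_lower_transposed(L, b):
    """Back substitution: solve L^T x = b."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(L[k][i] * x[k] for k in range(i + 1, n))
        x[i] = (b[i] - s) / L[i][i]
    return x

def gp_predict(x_train, y_train, x_new, noise_var=0.01):
    """Return (mean, std) of the zero-mean GP posterior at a single input x_new."""
    n = len(x_train)
    # Training covariance with observation noise added on the diagonal.
    K = [[rbf(x_train[i], x_train[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_lower_transposed(L, solve_lower(L, y_train))  # K^{-1} y
    k_star = [rbf(x, x_new) for x in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = rbf(x_new, x_new) - sum(vi * vi for vi in v)
    return mean, math.sqrt(max(var, 0.0))

# Toy data (not solubility measurements): one made-up descriptor and target.
x_train = [-2.0, -1.0, 0.0, 1.0, 2.0]
y_train = [math.sin(x) for x in x_train]

near = gp_predict(x_train, y_train, 0.5)  # inside the range of the training data
far = gp_predict(x_train, y_train, 6.0)   # far from any training point
print("near: mean=%.3f std=%.3f" % near)
print("far:  mean=%.3f std=%.3f" % far)
```

In this sketch the predicted standard deviation grows as the query point moves away from the training data, which is the behaviour that makes per-prediction error bars useful for judging how far a prediction can be trusted.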

Similar resources

Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules

We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investig...

Prediction of Satranidazole Solubility in Water-Polyethylene Glycol 400 Mixtures Using Extended Hildebrand Solubility Approach

The Extended Hildebrand Solubility Parameter Approach (EHSA) is used to estimate the solubility of satranidazole in binary solvent systems. The solubility of satranidazole in various water-PEG 400 mixtures was analyzed in terms of solute-solvent interactions using a modified version of Hildebrand-Scatchard treatment for regular solutions. The solubility equation employs term interaction ...

Prediction of the pharmaceutical solubility in water and organic solvents via different soft computing models

Solubility data for solids in aqueous and organic solvents are important physicochemical properties considered in the design of industrial processes and in theoretical studies. In this study, experimental solubility data of 666 pharmaceutical compounds in water and 712 pharmaceutical compounds in organic solvents were collected from different sources. Three different artificia...

Uniting Cheminformatics and Chemical Theory To Predict the Intrinsic Aqueous Solubility of Crystalline Druglike Molecules

We present four models of solution free-energy prediction for druglike molecules utilizing cheminformatics descriptors and theoretically calculated thermodynamic values. We make predictions of solution free energy using physics-based theory alone and using machine learning/quantitative structure-property relationship (QSPR) models. We also develop machine learning models where the theoretical e...

Hypertension Prediction in Primary School Students Using an Ensemble Machine Learning Method

Introduction: The prevalence of hypertension in children is increasing, and this complication is considered the most important risk factor for cardiovascular diseases in older age. Early detection and control of hypertension can prevent its progress and reduce its consequences. Machine learning methods can help predict this complication promptly and reduce cost and time. This study aimed to pro...

Journal:
  • Journal of Chemical Information and Modeling

Volume 47, Issue 2

Pages: -

Publication date: 2007